library(testthat)
library(htmltools)
library(leaflet)
library(viridis)
library(dtplyr)
library(dplyr)
library(magrittr)
library(data.table)
library(jsonlite)
library(rgdal)
library(htmlwidgets) # saving leaflet maps
Australia is a particularly challenging country to map. Australia is very large, highly urbanized, with cities far away from each other.
There are other challenges too. The ABS publishes shapefiles of statistical areas, but doesn’t provide an API to access them. Although an API is typically associated with frequently changing, small files, the lack of API creates an extra step in this common task. PSMA does publish shapefiles, and provide an API, but they are separated into states (reflecting the jurisdictions’ ‘ownership’ of the files). The API also has a nonstandard licence, which purports to allow free use, but contains a caveat that, again, complicates the automated production.1
The other challenge comes with general map making, which after a lot of interrupted resolve, I finally achieved thanks to Peter Ellis’s post demonstrating leaflet: https://ellisp.github.io/blog/2017/06/04/military-gdp.
Nonetheless, the first step of producing maps is often the hardest: obtaining the spatial data in the form the mapping software expects.
ASGS packageThe ASGS package contains the shapefiles from the ABS in package form for convenience. To install, download and install from Dropbox using the following chunk. (I would be interested in alternative methods of package distribution.)
if (!requireNamespace("ASGS", quietly = TRUE)) {
message("Attempting install of ASGS (200 MB) from Dropbox. ",
"This should take some minutes to download.")
tempf <- tempfile(fileext = ".tar.gz")
download.file(url = "https://www.dropbox.com/s/ff815l1zkabfb3z/ASGS_0.0.1.tar.gz?dl=1",
destfile = tempf)
install.packages(tempf, type = "source", repos = NULL)
}
library(ASGS)
The goal is to produce chloropleths showing the population distribution by SA2.
First, we need the data itself
ERP_by_SA2 <- fread("https://raw.githubusercontent.com/HughParsonage/ABS-data/master/Estimated-Resident-Population-SA2-2016.txt")
ERP_by_SA2[, SA2_MAIN11 := factor(ASGS_2011)]
plot()You can plot shapefiles immediately using the plot method.
plot(SA2_2011)
And it is fairly straightforward to create a chloropleth by adding colours
#' @return Viridis palette of x by quantile (decile by default).
palette_v <- function(x, n = 10) {
input <- data.table(x = x)
input[, order := 1:.N]
input[, decile := ntile(x, n)]
colortbl <- data.table(decile = 1:10,
colour = viridis(10))
colortbl[input, on = "decile"] %>%
setorder(order) %>%
.[["colour"]]
}
pal_v <- colorNumeric(palette = viridis(10),
domain = NULL)
# Copy shapefile and join with ERP_by_SA2 to create
# a column to fill by.
SA2_2011_join <-
SA2_2011
SA2_2011_join@data <-
inner_join(SA2_2011_join@data, ERP_by_SA2, by = "SA2_MAIN11") %>%
# + 1 in case any 0sqm regions create NaNs
mutate(Density = Value / ((ALBERS_SQM + 1) / 1e6),
Density_Decile = ntile(Density, 10))
plot(SA2_2011_join,
col = palette_v(SA2_2011_join@data$Value),
main = "Population by SA2")
plot(SA2_2011_join,
col = palette_v(SA2_2011_join@data$Density_Decile),
main = "Population density by SA2")
Maps so produced can often be surprisingly ‘sufficient’. I often run dev.copy2pdf(file = "my-map.pdf", width = 200, height = 150) to create a PDF to look closer at the cities. It’s not quite as natural or rich in features as mapps produced via leaflet() are below, but it’s simple, very reliable, and looks great when you zoom in.
If you’re creating a single, static map to be included in a PDF, I would run dev.copy2pdf here, pound LaTeX with trim and clip options to \includegraphics for about 15 minutes, then TikZ the rest. I doubt what I would produce would be worse than doing it the canonical way.
However, if the statistic for creating the colours in the chloropleth are likely to change, or you need more flexibility, you should use ggplot2 with the geom_polygon layer to create the map.
ggplot.NotYetImplemented
Because of the challenges of mapping in Australia, interactive chloropleths have considerable appeal. Leaflet through its eponymous R package offers this feature.
Like with plot, leaflet() accepts shapefiles as inputs, and you can pass shapefiles immediately. However, for the SA2 shapefiles, this results in hefty HTML files. On my machine, RStudio creaked under their weight.
(You should run the following two chunks. I’ve set eval=FALSE to reduce the file size of this page from 250 MB to 25 MB!)
SA2_2011_join %>%
leaflet %>%
addPolygons
The default settings of leaflet are not suitable for presentation; however, leaflet offers enough flexibility to create elegant maps:
SA2_2011_join %>%
leaflet %>%
addPolygons(stroke = TRUE,
opacity = 1,
weight = 1, # the border thickness / pixels
color = "white",
fillColor = ~pal_v(SA2_2011_join@data$Density_Decile),
fillOpacity = 1)
However, the file is too large (125 MB on my computer) to be shared easily. library(ASGS) offers SA2_2011_simple, a simplified version (20%) of SA2_2011 produced from the GUI at http://mapshaper.org/.
Further compression of the HTML file can be achieved by reducing the precision of the coordinates. In contrast to simplifying the shapefile (which reduces the detail of borders), reducing the precision does not change the shape of borders much, but translates them east or north.
The ABS’s standard uses a nominal precision of about 1 mm, which is vastly more precise than needed for SA2 boundaries. By reducing the precision, the raw HTML (which lists the coordinates of every vertex) will have literally use fewer characters to mark each coordinate, shaving off a considerable chunk of the size of your page, improving performance.
I chose a precision of 0.0001 which means the borders can be translated at most 10 metres. Such a modest loss of precision, even for detailed places like inner Sydney and Melbourne, had no visible effect.
The file in the package was created simply by running (again in http://mapshaper.org/)
mapshaper -o precision=0.0001 format=topojson out.json
The resulting chloropleth is 15 MB, which I suspect cannot be improved. (Reducing the precision further resulted in unsightly overlaps at Walsh Bay.)
#' @param x
#' @param digits
#' @return For values over 1000, a comma()'d version of x to \code{digits} significant figures; otherwise,
#' x to \code{digits} significant places.
#'
#' Using \code{comma(signif(x, 2))} would produce '120,000.0' instead of '120,000'.
signif_comma <- function(x, digits = 2) {
y <- signif(x, digits = digits)
out <- y
out[log10(x) > 3] <- scales::comma(y[log10(x) > 3])
out
}
expect_equal(signif_comma(c(1.2, 12, 1200, 12000, 120000)),
c("1.2", "12", "1,200", "12,000", "120,000"))
# Copy shapefile and join with ERP_by_SA2 to create
# a column to fill by.
SA2_2011_simple %>%
leaflet %>%
addPolygons(stroke = TRUE,
opacity = 1,
weight = 1, # the border thickness / pixels
color = "black",
fillColor = ~pal_v(SA2_2011_join@data$Density_Decile),
fillOpacity = 1,
label = lapply(paste0("<b>",
SA2_2011_join@data$SA2_NAME11, ":",
"</b><br>",
signif_comma(SA2_2011_join@data$Density, 2),
"/km²"),
HTML),
highlightOptions = highlightOptions(weight = 2,
color = "white",
opacity = 1,
dashArray = "",
bringToFront = TRUE))